
[CorrelatedValuePropagation] Fold calls to UCMP/SCMP when we know that ranges of operands do not overlap #97235

Merged: 4 commits into llvm:main on Jul 2, 2024

Conversation

@Poseydon42 (Contributor) commented Jun 30, 2024

This patch adds folds for calls to ucmp/scmp intrinsics where we can establish that the range of the first operand is strictly to the left or strictly to the right of the range of the second operand.
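
As a concrete illustration (a minimal sketch mirroring the ucmp_2 test added in this patch; the function and value names are just for exposition), the unsigned fold applies when assumptions pin the operands to disjoint ranges:

define i8 @example(i32 %x, i32 %y) {
  ; %x is known to be in [4, +INF)
  %cond1 = icmp uge i32 %x, 4
  call void @llvm.assume(i1 %cond1)
  ; %y is known to be in [0, 4)
  %cond2 = icmp ult i32 %y, 4
  call void @llvm.assume(i1 %cond2)
  ; every possible %x is unsigned-greater than every possible %y,
  ; so this call always evaluates to 1
  %r = call i8 @llvm.ucmp(i32 %x, i32 %y)
  ret i8 %r
}

After -passes=correlated-propagation, the intrinsic call is erased and its uses are replaced with the constant 1 (so the function ends in ret i8 1); the assumes are left in place for later passes to clean up.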


@llvmbot (Member) commented Jun 30, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (Poseydon42)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/97235.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp (+26)
  • (added) llvm/test/Transforms/CorrelatedValuePropagation/uscmp.ll (+118)
diff --git a/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp b/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
index 88adeb597e755..853925bba0be1 100644
--- a/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
+++ b/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
@@ -548,6 +548,27 @@ static bool processAbsIntrinsic(IntrinsicInst *II, LazyValueInfo *LVI) {
   return false;
 }
 
+static bool processCmpIntrinsic(IntrinsicInst *II, LazyValueInfo *LVI) {
+  bool IsSigned = II->getIntrinsicID() == Intrinsic::scmp;
+  ConstantRange LHS_CR = LVI->getConstantRangeAtUse(II->getOperandUse(0),
+                                                    /*UndefAllowed*/ false);
+  ConstantRange RHS_CR = LVI->getConstantRangeAtUse(II->getOperandUse(1),
+                                                    /*UndefAllowed*/ false);
+
+  if (LHS_CR.icmp(IsSigned ? ICmpInst::ICMP_SGT : ICmpInst::ICMP_UGT, RHS_CR)) {
+    II->replaceAllUsesWith(ConstantInt::get(II->getType(), 1));
+    II->eraseFromParent();
+    return true;
+  }
+  if (LHS_CR.icmp(IsSigned ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_ULT, RHS_CR)) {
+    II->replaceAllUsesWith(ConstantInt::getSigned(II->getType(), -1));
+    II->eraseFromParent();
+    return true;
+  }
+
+  return false;
+}
+
 // See if this min/max intrinsic always picks its one specific operand.
 // If not, check whether we can canonicalize signed minmax into unsigned version
 static bool processMinMaxIntrinsic(MinMaxIntrinsic *MM, LazyValueInfo *LVI) {
@@ -639,6 +660,11 @@ static bool processCallSite(CallBase &CB, LazyValueInfo *LVI) {
     return processAbsIntrinsic(&cast<IntrinsicInst>(CB), LVI);
   }
 
+  if (CB.getIntrinsicID() == Intrinsic::scmp ||
+      CB.getIntrinsicID() == Intrinsic::ucmp) {
+    return processCmpIntrinsic(&cast<IntrinsicInst>(CB), LVI);
+  }
+
   if (auto *MM = dyn_cast<MinMaxIntrinsic>(&CB)) {
     return processMinMaxIntrinsic(MM, LVI);
   }
diff --git a/llvm/test/Transforms/CorrelatedValuePropagation/uscmp.ll b/llvm/test/Transforms/CorrelatedValuePropagation/uscmp.ll
new file mode 100644
index 0000000000000..7a3b815f64f12
--- /dev/null
+++ b/llvm/test/Transforms/CorrelatedValuePropagation/uscmp.ll
@@ -0,0 +1,118 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=correlated-propagation -S | FileCheck %s
+
+; If nothing is known we can't change anything
+define i8 @ucmp_0(i32 %x, i32 %y) {
+  %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+define i8 @scmp_0(i32 %x, i32 %y) {
+  %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+; If we know that range of LHS < range of RHS then return -1
+define i8 @ucmp_1(i32 %x, i32 %y) {
+  ; X is within [4, 8)
+  %cond1 = icmp uge i32 %x, 4
+  call void @llvm.assume(i1 %cond1)
+  %cond2 = icmp ult i32 %x, 8
+  call void @llvm.assume(i1 %cond2)
+  ; Y is within [8, +INF)
+  %cond3 = icmp uge i32 %y, 8
+  call void @llvm.assume(i1 %cond3)
+
+  %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+define i8 @scmp_1(i32 %x, i32 %y) {
+  ; X is within [-5, 3)
+  %cond1 = icmp sge i32 %x, -5
+  call void @llvm.assume(i1 %cond1)
+  %cond2 = icmp slt i32 %x, 3
+  call void @llvm.assume(i1 %cond2)
+  ; Y is within [3, +INF)
+  %cond3 = icmp sge i32 %y, 3
+  call void @llvm.assume(i1 %cond3)
+
+  %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+; If we know that range of LHS > range of RHS then return 1
+define i8 @ucmp_2(i32 %x, i32 %y) {
+  ; X is within [4, +INF)
+  %cond1 = icmp uge i32 %x, 4
+  call void @llvm.assume(i1 %cond1)
+  ; Y is within [0, 4)
+  %cond2 = icmp ult i32 %y, 4
+  call void @llvm.assume(i1 %cond2)
+
+  %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+define i8 @scmp_2(i32 %x, i32 %y) {
+  ; X is within [4, +INF)
+  %cond1 = icmp sge i32 %x, 4
+  call void @llvm.assume(i1 %cond1)
+  ; Y is within [-INF, 4)
+  %cond2 = icmp slt i32 %y, 4
+  call void @llvm.assume(i1 %cond2)
+
+  %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+; Negative test: ranges overlap
+define i8 @ucmp_3(i32 %x, i32 %y) {
+  ; X is within [4, +INF)
+  %cond1 = icmp uge i32 %x, 4
+  call void @llvm.assume(i1 %cond1)
+  ; Y is within [0, 6)
+  %cond2 = icmp ult i32 %y, 6
+  call void @llvm.assume(i1 %cond2)
+
+  %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+define i8 @scmp_3(i32 %x, i32 %y) {
+  ; X is within [2, +INF)
+  %cond1 = icmp sge i32 %x, 2
+  call void @llvm.assume(i1 %cond1)
+  ; Y is within [-INF, 4)
+  %cond2 = icmp slt i32 %y, 4
+  call void @llvm.assume(i1 %cond2)
+
+  %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+; Negative test: mismatched signedness of range-establishing comparisons and
+;                of the intrinsic
+define i8 @ucmp_4(i32 %x, i32 %y) {
+  ; X is within [4, +INF)
+  %cond1 = icmp sge i32 %x, 4
+  call void @llvm.assume(i1 %cond1)
+  ; Y is within [0, 4)
+  %cond2 = icmp slt i32 %y, 4
+  call void @llvm.assume(i1 %cond2)
+
+  %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+  ret i8 %1
+}
+
+define i8 @scmp_4(i32 %x, i32 %y) {
+  ; X is within [4, +INF)
+  %cond1 = icmp uge i32 %x, 4
+  call void @llvm.assume(i1 %cond1)
+  ; Y is within [0, 4)
+  %cond2 = icmp ult i32 %y, 4
+  call void @llvm.assume(i1 %cond2)
+
+  %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+  ret i8 %1
+}
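
For the symmetric case (again a sketch, modeled on the scmp_1 test above), the signed variant folds to -1 when the first operand's range lies strictly to the left of the second's. The key is that ConstantRange::icmp(Pred, Other), as used in processCmpIntrinsic, returns true only when the predicate holds for every pair of values drawn from the two ranges:

define i8 @example_signed(i32 %x, i32 %y) {
  ; %x is known to be in [-5, 3)
  %c1 = icmp sge i32 %x, -5
  call void @llvm.assume(i1 %c1)
  %c2 = icmp slt i32 %x, 3
  call void @llvm.assume(i1 %c2)
  ; %y is known to be in [3, +INF)
  %c3 = icmp sge i32 %y, 3
  call void @llvm.assume(i1 %c3)
  ; ICMP_SLT holds for all pairs: the largest possible %x (2) is still
  ; below the smallest possible %y (3), so this call always yields -1
  %r = call i8 @llvm.scmp(i32 %x, i32 %y)
  ret i8 %r
}

Here the fold replaces the call with ret i8 -1; when the ranges overlap by even a single value (as in the ucmp_3/scmp_3 tests), neither icmp query succeeds and the call is left untouched.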

@nikic changed the title from "[ConstantValuePropagation] Fold calls to UCMP/SCMP when we know that ranges of operands do not overlap" to "[CorrelatedValuePropagation] Fold calls to UCMP/SCMP when we know that ranges of operands do not overlap" on Jul 1, 2024

@dtcxzyw (Member) left a comment:

LGTM. Thank you!

@nikic merged commit a616f57 into llvm:main on Jul 2, 2024
7 checks passed
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
[CorrelatedValuePropagation] Fold calls to UCMP/SCMP when we know that ranges of operands do not overlap (llvm#97235)

This patch adds folds for calls to `ucmp`/`scmp` intrinsics where we can
establish that the range of the first operand is strictly to the left or
strictly to the right of the range of the second operand.
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
[CorrelatedValuePropagation] Fold calls to UCMP/SCMP when we know that ranges of operands do not overlap (llvm#97235)

This patch adds folds for calls to `ucmp`/`scmp` intrinsics where we can
establish that the range of the first operand is strictly to the left or
strictly to the right of the range of the second operand.